Data Science Education in Undergraduate Physics: Lessons Learned from a Community of Practice

arXiv.org Artificial Intelligence

It is becoming increasingly important that physics educators equip their students with the skills to work with data effectively. However, many educators may lack the necessary training and expertise in data science to teach these skills. To address this gap, we created the Data Science Education Community of Practice (DSECOP), bringing together graduate students and physics educators from different institutions and backgrounds to share best practices and lessons learned from integrating data science into undergraduate physics education. In this article we present insights and experiences from this community of practice, highlighting key strategies and challenges in incorporating data science into the introductory physics curriculum. Our goal is to provide guidance and inspiration to educators who seek to integrate data science into their teaching, helping to prepare the next generation of physicists for a data-driven world.


Neural Approaches to Entity-Centric Information Extraction

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) has a huge impact on our daily lives, with applications such as voice assistants, facial recognition, chatbots, and autonomous cars. Natural Language Processing (NLP) is a cross-discipline of AI and Linguistics dedicated to the study of text understanding. This is a very challenging area due to the unstructured nature of language, with its many ambiguities and corner cases. In this thesis we address a very specific area of NLP that involves the understanding of entities (e.g., names of people, organizations, locations) in text. First, we introduce a radically different, entity-centric view of the information in text. We argue that instead of using individual mentions in text to understand their meaning, we should build applications that work in terms of entity concepts. Next, we present a more detailed model of how the entity-centric approach can be used for the entity linking task. In our work, we show that this task can be improved by performing entity linking at the coreference cluster level rather than for each mention individually. In our next work, we further study how information from Knowledge Base entities can be integrated into text. Finally, we analyze how entities evolve over time.
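The contrast between mention-level and cluster-level entity linking can be illustrated with a minimal Python sketch. It is not the model from the thesis. It assumes that some upstream component has already produced per-mention candidate scores and coreference clusters, and it simply sums scores within a cluster. The function names, the score-summing rule, and the entity identifiers (`E_barack`, `E_michelle`) are all hypothetical.

```python
from collections import defaultdict


def link_mentions_individually(mention_scores):
    """Pick the best-scoring candidate entity for each mention on its own."""
    return {
        mention: max(scores, key=scores.get)
        for mention, scores in mention_scores.items()
    }


def link_clusters(clusters, mention_scores):
    """Pick one entity per coreference cluster.

    Assumption: candidate scores are aggregated by a plain sum over the
    mentions in the cluster; a real system could aggregate differently.
    """
    links = {}
    for cluster in clusters:
        totals = defaultdict(float)
        for mention in cluster:
            for entity, score in mention_scores[mention].items():
                totals[entity] += score
        best_entity = max(totals, key=totals.get)
        # Every mention in the cluster receives the same entity.
        for mention in cluster:
            links[mention] = best_entity
    return links


# Hypothetical scores: the short mention "Obama" is ambiguous on its own.
scores = {
    "Barack Obama": {"E_barack": 0.9, "E_michelle": 0.1},
    "Obama": {"E_barack": 0.45, "E_michelle": 0.55},
}
clusters = [["Barack Obama", "Obama"]]

individual_links = link_mentions_individually(scores)
cluster_links = link_clusters(clusters, scores)
```

In this toy input, linking each mention separately sends the ambiguous mention "Obama" to `E_michelle`, while the cluster-level decision assigns `E_barack` to both mentions because the unambiguous mention dominates the summed scores.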


Machine Learning Technique Predicting Video Streaming Views to Reduce Cost of Cloud Services

arXiv.org Artificial Intelligence

Video streams account for the largest share of online traffic. Multiple versions of a video are created to fit the user's device specifications. In cloud storage, keeping all versions of frequently accessed video streams in the repository for the long term imposes a significant cost on video streaming providers. Generally, the popularity of a video changes from one period to the next, so the number of views a video receives may drop, at which point the video should be deleted from the repository. Therefore, in this paper, we develop a method that predicts the popularity of each video stream in the repository for the next period. We also propose an algorithm that uses the predicted popularity of a video to compute the storage cost and then decides whether the video will be kept in or deleted from the cloud repository. The experimental results show a 15% reduction in the cost of cloud services compared to keeping all video streams.
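The keep-or-delete decision can be sketched in Python as follows. The abstract does not state the paper's actual cost model, so this is only an illustration under an assumed rule: a video version is kept when its storage cost for the next period does not exceed the cost of regenerating it on demand for the predicted number of views. The `Video` class, the `price_per_gb` and `regen_cost_per_view` parameters, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Video:
    video_id: str
    size_gb: float
    predicted_views: int  # output of some popularity predictor (assumed given)


def storage_cost(video, price_per_gb):
    """Cost of keeping the video in the repository for one period."""
    return video.size_gb * price_per_gb


def should_keep(video, price_per_gb, regen_cost_per_view):
    """Assumed rule: keep if storing is no more expensive than regenerating
    the video on demand for every predicted view."""
    on_demand_cost = video.predicted_views * regen_cost_per_view
    return storage_cost(video, price_per_gb) <= on_demand_cost


def partition_repository(videos, price_per_gb, regen_cost_per_view):
    """Split the repository into videos to keep and videos to delete."""
    keep, delete = [], []
    for video in videos:
        if should_keep(video, price_per_gb, regen_cost_per_view):
            keep.append(video.video_id)
        else:
            delete.append(video.video_id)
    return keep, delete


# Hypothetical repository and prices.
repo = [
    Video("popular", size_gb=2.0, predicted_views=100),
    Video("forgotten", size_gb=2.0, predicted_views=0),
]
kept, deleted = partition_repository(repo, price_per_gb=0.5, regen_cost_per_view=0.05)
```

With these made-up numbers, both videos cost 1.0 to store. The first is expected to cost 5.0 to serve on demand and is kept, while the second has no predicted views and is deleted.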


Determining Sentencing Recommendations and Patentability Using a Machine Learning Trained Expert System

arXiv.org Artificial Intelligence

This paper presents two studies that use a machine learning expert system (MLES). One focuses on a system to advise United States federal judges on consistent federal criminal sentencing, based on both the federal sentencing guidelines and offender characteristics. The other study aims to develop a system that could prospectively assist the U.S. Patent and Trademark Office in automating its patentability assessment process. Both studies use a machine learning-trained rule-fact expert system network that accepts input variables for training and presentation and outputs a scaled variable representing the system recommendation (e.g., the sentence length or the patentability assessment). This paper presents and compares the rule-fact networks that have been developed for these projects. It explains the decision-making process underlying the structures used for both networks and the data pre-processing that was required. By comparing the two systems, it also discusses how different methods can be used with the MLES.